Extending the punctuation module for European Portuguese

Authors

  • Fernando Batista
  • Helena Moniz
  • Isabel Trancoso
  • Hugo Meinedo
  • Ana Isabel Mata
  • Nuno J. Mamede
Abstract

This paper describes our recent work on extending the punctuation module of automatic subtitles for Portuguese Broadcast News. The main improvement was achieved by the use of prosodic information. This enabled the extension of the previous module, which covered only full stops and commas, to cover question marks as well. The approach uses lexical, acoustic and prosodic information. Our results show that prosodic information is relevant for all types of punctuation. An analysis of the results also shows which types of interrogative are better handled by our method, taking into account the specificities of Portuguese. Results may therefore differ across corpora, depending on which types of interrogatives are most frequent.

Similar resources

Recovering capitalization and punctuation marks for automatic speech recognition: Case study for Portuguese broadcast news

The following material presents a study about recovering punctuation marks and capitalization information from European Portuguese broadcast news speech transcriptions. Different approaches were tested for capitalization, both generative and discriminative, using finite state transducers automatically built from language models, and maximum entropy models. Several resources were used, includi...

Modules whose direct summands are FI-extending

A module $M$ is called FI-extending if every fully invariant submodule of $M$ is essential in a direct summand of $M$. It is not known whether a direct summand of an FI-extending module is also FI-extending. In this study, we give some answers to the question of under what conditions a direct summand of an FI-extending module is itself FI-extending.

$PI$-extending modules via nontrivial complex bundles and Abelian endomorphism rings

A module is said to be $PI$-extending provided that every projection invariant submodule is essential in a direct summand of the module. In this paper, we focus on direct summands and indecomposable decompositions of $PI$-extending modules. To this end, we provide several counterexamples, including the tangent bundles of complex spheres of dimension greater than or equal to 5 and certain hyper ...

A relative extending module and torsion precovers

We first characterize $\tau$-complemented modules with relative (pre)-covers. We also introduce an extending module relative to $\tau$-pure submodules on a hereditary torsion theory $\tau$ and give its relationship with $\tau$-complemented modules.

Recovering Capitalization and Punctuation Marks on Speech Transcriptions

This work addresses two metadata annotation tasks involved in the production of rich transcripts: automatic capitalization and punctuation mark recovery. The main focus concerns broadcast news, using both manual and automatic speech transcripts. Different capitalization models were analysed and compared, and results support the idea that generative approaches capture the structure of writte...

Publication date: 2010